[MiniMax-M2] Remove reduce_results kwarg from FusedMoE init by mounikamandava · Pull Request #1444 · vllm-project/vllm-gaudi

mounikamandava · 2026-05-12T19:55:52Z

Removes the reduce_results=False argument passed to FusedMoE in HpuMiniMaxM2MoE. Upstream vLLM no longer accepts this argument, so passing it causes worker startup to fail.

Upstream vLLM removed the reduce_results parameter from FusedMoE.__init__ (vllm/model_executor/layers/fused_moe/layer.py). The MoE output reduction is now decided internally based on TP/EP topology. The corresponding upstream model MiniMaxM2MoE (vllm/model_executor/models/minimax_m2.py) was updated accordingly, but the HPU port HpuMiniMaxM2MoE was not, so it still passes the now-unknown kwarg.
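The failure mode can be reproduced in isolation. The sketch below uses two stand-in classes (`FusedMoEBefore` and `FusedMoEAfter` are hypothetical and only mimic the shape of the constructor change, not vLLM's real signature) to show that a caller still passing the removed keyword gets a `TypeError` at construction time, which is presumably what surfaces as the worker startup failure.

```python
# Stand-in classes only: they mimic the shape of the problem,
# not the real vLLM FusedMoE constructor.


class FusedMoEBefore:
    """Hypothetical older constructor that still accepts reduce_results."""

    def __init__(self, num_experts: int, top_k: int, reduce_results: bool = False):
        self.num_experts = num_experts
        self.top_k = top_k
        self.reduce_results = reduce_results


class FusedMoEAfter:
    """Hypothetical newer constructor with reduce_results removed."""

    def __init__(self, num_experts: int, top_k: int):
        self.num_experts = num_experts
        self.top_k = top_k


def build_layer(layer_cls, **kwargs):
    """Return (layer, None) on success or (None, error message) on failure."""
    try:
        return layer_cls(**kwargs), None
    except TypeError as exc:
        return None, str(exc)


# Arguments as a stale port would pass them.
stale_kwargs = {"num_experts": 8, "top_k": 2, "reduce_results": False}

_, old_err = build_layer(FusedMoEBefore, **stale_kwargs)
_, new_err = build_layer(FusedMoEAfter, **stale_kwargs)

# The fix in this PR amounts to no longer passing the keyword.
fixed_kwargs = {k: v for k, v in stale_kwargs.items() if k != "reduce_results"}
fixed_layer, fixed_err = build_layer(FusedMoEAfter, **fixed_kwargs)

print("old signature:", old_err)
print("new signature, stale call:", new_err)
print("new signature, fixed call:", fixed_err)
```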

Fix:
Drop the reduce_results=False kwarg from the FusedMoE construction in HpuMiniMaxM2MoE. Behavior is unchanged because upstream now governs MoE output reduction internally based on TP/EP configuration.
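For ports that track a moving upstream, one possible safeguard is a small signature check in a unit test. This is not what the PR does (the PR simply drops the kwarg); it is a sketch of how a removed parameter could be detected before a worker fails at startup. `accepts_kwarg`, `LayerWithFlag`, and `LayerWithoutFlag` are hypothetical names introduced for illustration.

```python
import inspect


def accepts_kwarg(callable_obj, name: str) -> bool:
    """Return True if callable_obj can be called with `name` as a keyword."""
    params = inspect.signature(callable_obj).parameters
    keyword_kinds = (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    )
    if name in params and params[name].kind in keyword_kinds:
        return True
    # A **kwargs parameter accepts any keyword name.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


class LayerWithFlag:
    """Hypothetical layer whose constructor still takes reduce_results."""

    def __init__(self, hidden_size: int, reduce_results: bool = False):
        self.hidden_size = hidden_size


class LayerWithoutFlag:
    """Hypothetical layer whose constructor no longer takes reduce_results."""

    def __init__(self, hidden_size: int):
        self.hidden_size = hidden_size


print(accepts_kwarg(LayerWithFlag, "reduce_results"))
print(accepts_kwarg(LayerWithoutFlag, "reduce_results"))
```

A check along these lines only reports whether the keyword is accepted; it says nothing about whether the layer's behavior changed, so it complements rather than replaces an end-to-end run.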

Copilot

Pull request overview

This PR updates the Gaudi-specific MiniMax-M2 MoE implementation to stay compatible with upstream vLLM by removing a no-longer-supported reduce_results keyword argument when constructing FusedMoE, preventing worker startup failures.

Changes:

Remove the deprecated reduce_results=False kwarg from FusedMoE(...) initialization in HpuMiniMaxM2MoE.

github-actions · 2026-05-13T01:44:39Z

✅ CI Passed

All checks passed successfully against the following vllm commit:
54f548e9e58087f0155e4e164e416ad7efdfde6d

iboiko-habana

reduce_results was removed in vllm-project/vllm#35949. Thanks for the fix.

Fix accuracy of minimax m2 for tensor parallel size > 1. Reduce is handled in FusedMoE after #1377 and `reduce_results=False` dropped #1444

**Output without this PR:**

```
curl http://localhost:8000/v1/chat/completions -H "Content-Type: application/json" -d '{
  "model": "/mnt/weka/data/llm-d-models-pv/MiniMaxAI-MiniMax-M2.7",
  "messages": [
    {"role": "user", "content": [{"type": "text", "text": "Write a quick sort algorithm in python"}]}
  ],
  "max_tokens": 200
}'
```

{"id":"chatcmpl-8eb68aec66d7f527","object":"chat.completion","created":1778891236,"prompt_routed_experts":null,"model":"/mnt/weka/data/llm-d-models-pv/MiniMaxAI-MiniMax-M2.7","choices":[{"index":0,"message":{"role":"assistant","content":"<think>I hadnet me find a programme2/apto/c- 241?._o. no (the operation.yb-b\n> ыйо, not change this;~~ I think_colour ==\"light pink\";}) in...\n**The These must be not} was\n and \n\n</think>):\n\nI('key=ельблиматš micrac_ / 1)2rasm_0.2 → add__2dict_eagle/tabString/im不过是 \\_list-ofchf_one \nCompute_with_prt_init: (New Tool Pro)\n-Main%-day__ ** [B1] : {nb_z0'];\n--own-traor: with: =: use 0.096-10_l_`this col0: 26;```\n</t_lN-蔓音频四文アنتストu+002:htt 도 원책임.(↑): The thought_dirty_s","refusal":null,"annotations":null,"audio":null,"function_call":null,"tool_calls":[],"reasoning":null},"logprobs":null,"finish_reason":"length","stop_reason":null,"token_ids":null,"routed_experts":null}],"service_tier":null,"system_fingerprint":"vllm-0.20.1rc1.dev276+g54f548e9e-tp4-ep-614b7488","usage":{"prompt_tokens":45,"total_tokens":245,"completion_tokens":200,"prompt_tokens_details":null},"prompt_logprobs":null,"prompt_token_ids":null,"kv_transfer_params":null}

**With PR**

{"id":"chatcmpl-b79acb2e48acc5d0","object":"chat.completion","created":1778891747,"prompt_routed_experts":null,"model":"/mnt/weka/data/llm-d-models-pv/MiniMaxAI-MiniMax-M2.7","choices":[{"index":0,"message":{"role":"assistant","content":"<think>We are going to write a quick sort algorithm in Python.\n We will define a function quicksort that takes a list as input.\n We will choose a pivot (commonly the last element, but we can also choose a random element or the middle).\n We will partition the list into two parts: elements less than the pivot and elements greater than the pivot.\n Then we recursively sort the two parts and combine them with the pivot in between.\n\n However, note that the problem asks for a quick sort algorithm, so we'll implement the standard in-place quick sort.\n\n Steps:\n 1. If the list has length 0 or 1, it is already sorted.\n 2. Otherwise, select a pivot (we'll use the last element for simplicity).\n 3. Partition the list into two sublists: left (elements less than pivot) and right (elements greater than or equal to pivot).\n 4. Return the sorted left part, then the pivot, then the sorted right part.\n\n Alternatively, we","refusal":null,"annotations":null,"audio":null,"function_call":null,"tool_calls":[],"reasoning":null},"logprobs":null,"finish_reason":"length","stop_reason":null,"token_ids":null,"routed_experts":null}],"service_tier":null,"system_fingerprint":"vllm-0.20.1rc1.dev276+g54f548e9e-tp4-ep-614b7488","usage":{"prompt_tokens":45,"total_tokens":245,"completion_tokens":200,"prompt_tokens_details":null},"prompt_logprobs":null,"prompt_token_ids":null,"kv_transfer_params":null}

---------

Signed-off-by: Soila Kavulya <soila.p.kavulya@intel.com>
Co-authored-by: Iryna Boiko <iryna.boiko@intel.com>
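The commit message above does not spell out which reduction went wrong, so the following is only a generic sketch of why tensor-parallel output depends on reducing exactly once. It assumes a layer whose input dimension is split across ranks, so each rank holds a partial sum. `shard_columns` and `all_reduce_sum` are hypothetical pure-Python stand-ins, not vLLM or HPU collectives.

```python
def matvec(weight, x):
    """Dense matrix-vector product on plain Python lists."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weight]


def shard_columns(weight, x, tp_size):
    """Split the input dimension evenly across tp_size simulated ranks."""
    step = len(x) // tp_size
    shards = []
    for rank in range(tp_size):
        lo, hi = rank * step, (rank + 1) * step
        shards.append(([row[lo:hi] for row in weight], x[lo:hi]))
    return shards


def all_reduce_sum(per_rank):
    """Simulated all-reduce: every rank receives the elementwise sum."""
    total = [sum(vals) for vals in zip(*per_rank)]
    return [list(total) for _ in per_rank]


weight = [[1.0, 2.0, 3.0, 4.0], [0.5, -1.0, 2.0, 0.0]]
x = [1.0, 1.0, 2.0, -1.0]
tp_size = 2

reference = matvec(weight, x)  # single-device result
partials = [matvec(w, xs) for w, xs in shard_columns(weight, x, tp_size)]

reduced_once = all_reduce_sum(partials)       # matches the reference
reduced_twice = all_reduce_sum(reduced_once)  # scaled by tp_size

print("reference:     ", reference)
print("rank 0 partial:", partials[0])
print("reduced once:  ", reduced_once[0])
print("reduced twice: ", reduced_twice[0])
```

Skipping the reduction leaves each rank with a partial result, and reducing twice scales the result by the tensor-parallel size. Either error would be invisible at TP = 1, which is consistent with an accuracy problem that only appears for tensor parallel size > 1.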

[MiniMax-M2] Remove reduce_results kwarg from FusedMoE init

4f9654e

Copilot AI review requested due to automatic review settings May 12, 2026 19:55

mounikamandava requested review from PatrykWo, adobrzyn, afierka-intel, iboiko-habana, jbyczkow, kamil-kaczor, ksmusz, mgawarkiewicz-intel, michalkuligowski and xuechendi as code owners May 12, 2026 19:55

Copilot started reviewing on behalf of mounikamandava May 12, 2026 19:56

Copilot AI reviewed May 12, 2026


github-actions Bot mentioned this pull request May 12, 2026

🚦 Team Review Dashboard #701

Open

iboiko-habana approved these changes May 13, 2026


iboiko-habana merged commit cbc78c0 into vllm-project:main May 13, 2026
5 of 6 checks passed

skavulya mentioned this pull request May 16, 2026

Fix accuracy issue in minimax_m2 with TP > 1 #1451

Merged

This was referenced May 28, 2026

[v0.21.0] Fix accuracy issue in minimax_m2 with TP > 1 #1505

Closed

[v0.21.0] Fix accuracy issue in minimax_m2 with TP > 1 #1506

Closed

iboiko-habana merged 1 commit into vllm-project:main from mounikamandava:fix-minimax-m2
